Automatic Detection of Comma Splices
Abstract
In English text, independent clauses should be separated by full stops (periods) or linked with conjunctions. Non-native speakers often link them improperly with commas instead, producing comma splices. This paper describes a method for detecting comma splices using Conditional Random Fields (CRF), with features derived from parse tree patterns. In experiments, our model achieved an average precision of 0.91 and recall of 0.28 in detecting comma splices, significantly outperforming both a baseline model using only local features and a widely used commercial grammar checker.
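As a rough illustration of the task framing, the sketch below treats each comma as an instance to be labelled and extracts a few local features around it, then scores predictions with precision and recall. It is not the paper's method: the parse tree features and the CRF itself are omitted, and the names `CONJUNCTIONS`, `PRONOUNS`, `comma_features`, `comma_instances`, and `precision_recall` are hypothetical, chosen only for this example.

```python
from typing import Dict, List, Set, Tuple

# Assumed word lists for illustration; the paper does not specify these.
CONJUNCTIONS = {"and", "but", "or", "nor", "for", "so", "yet"}
PRONOUNS = {"i", "you", "he", "she", "it", "we", "they"}


def comma_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Local features for the comma at index i (a baseline-style sketch)."""
    prev_tok = tokens[i - 1].lower() if i > 0 else "<s>"
    next_tok = tokens[i + 1].lower() if i + 1 < len(tokens) else "</s>"
    return {
        "prev": prev_tok,
        "next": next_tok,
        # A conjunction after the comma suggests a properly linked clause.
        "next_is_conjunction": next_tok in CONJUNCTIONS,
        # A subject pronoun after the comma may start a new independent clause.
        "next_is_pronoun": next_tok in PRONOUNS,
    }


def comma_instances(tokens: List[str]) -> List[Tuple[int, Dict[str, object]]]:
    """One (position, features) pair per comma token in the sentence."""
    return [(i, comma_features(tokens, i)) for i, t in enumerate(tokens) if t == ","]


def precision_recall(gold: Set[int], predicted: Set[int]) -> Tuple[float, float]:
    """Precision and recall over sets of comma positions flagged as splices."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    sentence = "I was tired , I went home".split()
    for position, features in comma_instances(sentence):
        print(position, features)
```

A high-precision, low-recall detector of the kind reported in the abstract flags few commas but is usually right when it does; `precision_recall` makes that trade-off explicit for any set of predictions.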
Similar Resources
Reducing Light Change Effects in Automatic Road Detection
Automatic road extraction from aerial images can be very helpful in traffic control and vehicle guidance systems. Most road detection approaches are based on image segmentation algorithms. Color-based segmentation is very sensitive to light changes, so changes in weather conditions affect the recognition rate of road detection systems. In order to reduce the light change ...
Using Machine Learning Algorithms for Automatic Cyber Bullying Detection in Arabic Social Media
Social media allows people to interact and express their thoughts or feelings about different subjects. However, some users may write offensive tweets to others via social media, which is known as cyber bullying. Successful prevention depends on automatically detecting malicious messages. Automatic detection of bullying in the text of social media by analyzing the text ("tweets") via one of the machine l...
Automatic Comma Insertion of Lecture Transcripts Based on Multiple Annotations
To enhance readability and usability of speech recognition results, automatic punctuation is an essential process. In this paper, we address automatic comma prediction based on conditional random fields (CRF) using lexical, syntactic and pause information. Since there is large disagreement in comma insertion between humans, we model individual tendencies of punctuation using annotations given b...
Detecting Commas in Slovak Legal Texts
This paper reports on initial experiments with automatic comma recovery in legal texts. In deciding whether to insert a comma or not, we propose to use the value of the probability of a bigram of two words without a comma and a trigram of the words with the comma. The probability is determined by the language model trained on sentences with commas labeled as separate words. In the training data...